{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Geocoding with Python\n",
    "\n",
    "Agenda:\n",
    "- Geocoding addresses to latitude/longitude\n",
    "- Exploring locations with the Google Places API\n",
    "- Reverse geocoding latitude/longitude to an address\n",
    "- Reverse geocoding latitude/longitude to block FIPS code\n",
    "\n",
    "**You'll need a Google API key to use the Google Maps Geocoding API and the Google Places API Web Service:**\n",
    "1. Go to https://console.developers.google.com/project and sign in\n",
    "1. Create a new project and call it cp255, then click create\n",
    "1. On the screen with all the APIs listed, click \"Google Places API Web Service\" under Google Maps APIs, then click the Enable API button\n",
    "1. Go to Credentials and click create credentials, choose API Key\n",
    "1. Copy your API key when it is displayed, then create keys.py with the line `google_maps_api_key='YOUR-KEY-HERE'`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd, requests, time\n",
    "from geopy.geocoders import GoogleV3\n",
    "\n",
    "# import my api key saved in a local file i called keys.py\n",
    "from keys import google_maps_api_key"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# set the pause duration between api requests\n",
    "pause = 0.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 1: Geocoding addresses to lat-long\n",
    "\n",
    "We will use the Google Maps geocoding API. You don't need an API key for this.\n",
    "- Documentation: https://developers.google.com/maps/documentation/geocoding/intro\n",
    "- Example request: http://maps.googleapis.com/maps/api/geocode/json?address=350+5th+Ave,+New+York,+NY+10118&sensor=false"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>350 5th Ave, New York, NY 10118</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>100 Larkin St, San Francisco, CA 94102</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Wurster Hall, Berkeley, CA</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  address\n",
       "0         350 5th Ave, New York, NY 10118\n",
       "1  100 Larkin St, San Francisco, CA 94102\n",
       "2              Wurster Hall, Berkeley, CA"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "locations = pd.DataFrame()\n",
    "locations['address'] = ['350 5th Ave, New York, NY 10118',\n",
    "                        '100 Larkin St, San Francisco, CA 94102',\n",
    "                        'Wurster Hall, Berkeley, CA']\n",
    "locations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# function that accepts an address string, sends it to the Google API, and returns the lat-long API result\n",
    "def geocode(address):\n",
    "    time.sleep(pause) #pause for some duration before each request, to not hammer their server\n",
    "    url = 'http://maps.googleapis.com/maps/api/geocode/json?address={}&sensor=false' #api url with placeholders\n",
    "    request = url.format(address) #fill in the placeholder with a variable\n",
    "    response = requests.get(request) #send the request to the server and get the response\n",
    "    data = response.json() #convert the response json string into a dict\n",
    "    \n",
    "    if len(data['results']) > 0: #if google was able to geolocate our address, extract lat-long from result\n",
    "        latitude = data['results'][0]['geometry']['location']['lat']\n",
    "        longitude = data['results'][0]['geometry']['location']['lng']\n",
    "        return '{},{}'.format(latitude, longitude) #return lat-long as a string in the format google likes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>address</th>\n",
       "      <th>latlng</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>350 5th Ave, New York, NY 10118</td>\n",
       "      <td>40.7487097,-73.9856556</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>100 Larkin St, San Francisco, CA 94102</td>\n",
       "      <td>37.7791156,-122.4157662</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Wurster Hall, Berkeley, CA</td>\n",
       "      <td>37.8707352,-122.2548935</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  address                   latlng\n",
       "0         350 5th Ave, New York, NY 10118   40.7487097,-73.9856556\n",
       "1  100 Larkin St, San Francisco, CA 94102  37.7791156,-122.4157662\n",
       "2              Wurster Hall, Berkeley, CA  37.8707352,-122.2548935"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# for each value in the address column, geocode it, save results as new df column\n",
    "locations['latlng'] = locations['address'].map(geocode)\n",
    "locations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>address</th>\n",
       "      <th>latlng</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>350 5th Ave, New York, NY 10118</td>\n",
       "      <td>40.7487097,-73.9856556</td>\n",
       "      <td>40.7487097</td>\n",
       "      <td>-73.9856556</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>100 Larkin St, San Francisco, CA 94102</td>\n",
       "      <td>37.7791156,-122.4157662</td>\n",
       "      <td>37.7791156</td>\n",
       "      <td>-122.4157662</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Wurster Hall, Berkeley, CA</td>\n",
       "      <td>37.8707352,-122.2548935</td>\n",
       "      <td>37.8707352</td>\n",
       "      <td>-122.2548935</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  address                   latlng  \\\n",
       "0         350 5th Ave, New York, NY 10118   40.7487097,-73.9856556   \n",
       "1  100 Larkin St, San Francisco, CA 94102  37.7791156,-122.4157662   \n",
       "2              Wurster Hall, Berkeley, CA  37.8707352,-122.2548935   \n",
       "\n",
       "     latitude     longitude  \n",
       "0  40.7487097   -73.9856556  \n",
       "1  37.7791156  -122.4157662  \n",
       "2  37.8707352  -122.2548935  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# parse the result into separate lat and lon columns for easy mapping\n",
    "locations['latitude'] = locations['latlng'].map(lambda x: x.split(',')[0])\n",
    "locations['longitude'] = locations['latlng'].map(lambda x: x.split(',')[1])\n",
    "locations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 2: Google Places API\n",
    "\n",
    "We will use Google's Places API to look up places in the vicinity of some location. You need an API key for this.\n",
    "- Documentation: https://developers.google.com/places/\n",
    "- Example request: https://maps.googleapis.com/maps/api/place/search/json?keyword=coffee&location=37.8683811,-122.2589063&radius=1000&sensor=false&key=YOUR-KEY-HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'37.8707352,-122.2548935'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# google places api url, with placeholders\n",
    "url = 'https://maps.googleapis.com/maps/api/place/search/json?keyword={}&location={}&radius={}&key={}&sensor=false'\n",
    "\n",
    "# what keyword to search for\n",
    "keyword = 'restaurant'\n",
    "\n",
    "# define the radius (in meters) for the search\n",
    "radius = 1000\n",
    "\n",
    "# define the location coordinates (of wurster hall)\n",
    "location = locations.loc[2, 'latlng']\n",
    "location"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# add our variables into the url, submit the request to the api, and load the response\n",
    "request = url.format(keyword, location, radius, google_maps_api_key)\n",
    "response = requests.get(request)\n",
    "data = response.json()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>geometry</th>\n",
       "      <th>rating</th>\n",
       "      <th>vicinity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Freehouse</td>\n",
       "      <td>{'location': {'lat': 37.869081, 'lng': -122.25...</td>\n",
       "      <td>4.0</td>\n",
       "      <td>2700 Bancroft Way, Berkeley</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Finfiné Ethiopian Restaurant</td>\n",
       "      <td>{'location': {'lat': 37.8639074, 'lng': -122.2...</td>\n",
       "      <td>4.4</td>\n",
       "      <td>2556 Telegraph Avenue, Berkeley</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Saturn Cafe</td>\n",
       "      <td>{'location': {'lat': 37.8697714, 'lng': -122.2...</td>\n",
       "      <td>4.2</td>\n",
       "      <td>2175 Allston Way, Berkeley</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Subway</td>\n",
       "      <td>{'location': {'lat': 37.8625827, 'lng': -122.2...</td>\n",
       "      <td>3.3</td>\n",
       "      <td>2618 Telegraph Avenue, Berkeley</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Subway</td>\n",
       "      <td>{'location': {'lat': 37.8752456, 'lng': -122.2...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2509 Hearst Avenue, Berkeley</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                           name  \\\n",
       "0                     Freehouse   \n",
       "1  Finfiné Ethiopian Restaurant   \n",
       "2                   Saturn Cafe   \n",
       "3                        Subway   \n",
       "4                        Subway   \n",
       "\n",
       "                                            geometry  rating  \\\n",
       "0  {'location': {'lat': 37.869081, 'lng': -122.25...     4.0   \n",
       "1  {'location': {'lat': 37.8639074, 'lng': -122.2...     4.4   \n",
       "2  {'location': {'lat': 37.8697714, 'lng': -122.2...     4.2   \n",
       "3  {'location': {'lat': 37.8625827, 'lng': -122.2...     3.3   \n",
       "4  {'location': {'lat': 37.8752456, 'lng': -122.2...     NaN   \n",
       "\n",
       "                          vicinity  \n",
       "0      2700 Bancroft Way, Berkeley  \n",
       "1  2556 Telegraph Avenue, Berkeley  \n",
       "2       2175 Allston Way, Berkeley  \n",
       "3  2618 Telegraph Avenue, Berkeley  \n",
       "4     2509 Hearst Avenue, Berkeley  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "places = pd.DataFrame(data['results'])\n",
    "places = places[['name', 'geometry', 'rating', 'vicinity']]\n",
    "places.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>37.869081</td>\n",
       "      <td>-122.254349</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>37.863907</td>\n",
       "      <td>-122.258973</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>37.869771</td>\n",
       "      <td>-122.266004</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>37.862583</td>\n",
       "      <td>-122.258960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>37.875246</td>\n",
       "      <td>-122.259667</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    latitude   longitude\n",
       "0  37.869081 -122.254349\n",
       "1  37.863907 -122.258973\n",
       "2  37.869771 -122.266004\n",
       "3  37.862583 -122.258960\n",
       "4  37.875246 -122.259667"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# parse out lat-long and return it as a series -> this creates a dataframe of all the results when you .apply()\n",
    "def parse_coords(geometry):\n",
    "    if isinstance(geometry, dict):\n",
    "        lng = geometry['location']['lng']\n",
    "        lat = geometry['location']['lat']\n",
    "        return pd.Series({'latitude':lat, 'longitude':lng})\n",
    "    \n",
    "# test our function\n",
    "places['geometry'].head().apply(parse_coords)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>rating</th>\n",
       "      <th>vicinity</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>Kiraku</td>\n",
       "      <td>4.6</td>\n",
       "      <td>2566 Telegraph Avenue, Berkeley</td>\n",
       "      <td>37.863684</td>\n",
       "      <td>-122.258986</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>KoJa Kitchen</td>\n",
       "      <td>4.6</td>\n",
       "      <td>2395 Telegraph Avenue, Berkeley</td>\n",
       "      <td>37.867121</td>\n",
       "      <td>-122.258619</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Musashi</td>\n",
       "      <td>4.6</td>\n",
       "      <td>2126 Dwight Way, Berkeley</td>\n",
       "      <td>37.863992</td>\n",
       "      <td>-122.266524</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Remy's</td>\n",
       "      <td>4.4</td>\n",
       "      <td>2506 Haste Street, Berkeley</td>\n",
       "      <td>37.865924</td>\n",
       "      <td>-122.258037</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Finfiné Ethiopian Restaurant</td>\n",
       "      <td>4.4</td>\n",
       "      <td>2556 Telegraph Avenue, Berkeley</td>\n",
       "      <td>37.863907</td>\n",
       "      <td>-122.258973</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                            name  rating                         vicinity  \\\n",
       "17                        Kiraku     4.6  2566 Telegraph Avenue, Berkeley   \n",
       "14                  KoJa Kitchen     4.6  2395 Telegraph Avenue, Berkeley   \n",
       "13                       Musashi     4.6        2126 Dwight Way, Berkeley   \n",
       "15                        Remy's     4.4      2506 Haste Street, Berkeley   \n",
       "1   Finfiné Ethiopian Restaurant     4.4  2556 Telegraph Avenue, Berkeley   \n",
       "\n",
       "     latitude   longitude  \n",
       "17  37.863684 -122.258986  \n",
       "14  37.867121 -122.258619  \n",
       "13  37.863992 -122.266524  \n",
       "15  37.865924 -122.258037  \n",
       "1   37.863907 -122.258973  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# now run our function on the whole dataframe and SAVE THE OUTPUT to 2 new dataframe columns\n",
    "places[['latitude', 'longitude']] = places['geometry'].apply(parse_coords)\n",
    "places_clean = places.drop('geometry', axis=1)\n",
    "places_clean.sort_values(by='rating', ascending=False).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 3: Reverse geocoding (address lookup)\n",
    "\n",
    "We'll use Google's reverse geocoding API.\n",
    "- Documentation: https://developers.google.com/maps/documentation/geocoding/intro#ReverseGeocoding\n",
    "- Example request: https://maps.googleapis.com/maps/api/geocode/json?latlng=34.537094,-82.630303\n",
    "\n",
    "You can do this manually, just like in the previous two sections, but it's a little more complicated to parse Google's address components results. If we just want addresses, we can use [geopy](https://geopy.readthedocs.org/en/release-0.96.3/#geopy.geocoders.GoogleV3) to simply call Google's API automatically for us."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34.537094</td>\n",
       "      <td>-82.630303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>35.025700</td>\n",
       "      <td>-78.970500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>39.151817</td>\n",
       "      <td>-77.163810</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>38.636738</td>\n",
       "      <td>-121.319550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>47.616955</td>\n",
       "      <td>-122.348921</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    latitude   longitude\n",
       "0  34.537094  -82.630303\n",
       "1  35.025700  -78.970500\n",
       "2  39.151817  -77.163810\n",
       "3  38.636738 -121.319550\n",
       "4  47.616955 -122.348921"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# load usa point data and keep only the first 5 rows\n",
    "usa = pd.read_csv('data/usa-latlong.csv')\n",
    "usa = usa.head()\n",
    "usa"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>latlng</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34.537094</td>\n",
       "      <td>-82.630303</td>\n",
       "      <td>34.537094,-82.630303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>35.025700</td>\n",
       "      <td>-78.970500</td>\n",
       "      <td>35.0257,-78.9705</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>39.151817</td>\n",
       "      <td>-77.163810</td>\n",
       "      <td>39.151817,-77.16381</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>38.636738</td>\n",
       "      <td>-121.319550</td>\n",
       "      <td>38.636738,-121.31955</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>47.616955</td>\n",
       "      <td>-122.348921</td>\n",
       "      <td>47.616955,-122.34892099999999</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    latitude   longitude                         latlng\n",
       "0  34.537094  -82.630303           34.537094,-82.630303\n",
       "1  35.025700  -78.970500               35.0257,-78.9705\n",
       "2  39.151817  -77.163810            39.151817,-77.16381\n",
       "3  38.636738 -121.319550           38.636738,-121.31955\n",
       "4  47.616955 -122.348921  47.616955,-122.34892099999999"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create a column to put lat-long into the format google likes - this just makes it easier to call their API\n",
    "usa['latlng'] = usa.apply(lambda row: '{},{}'.format(row['latitude'], row['longitude']), axis=1)\n",
    "usa"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>latlng</th>\n",
       "      <th>address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34.537094</td>\n",
       "      <td>-82.630303</td>\n",
       "      <td>34.537094,-82.630303</td>\n",
       "      <td>111 Simpson Rd, Anderson, SC 29621, USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>35.025700</td>\n",
       "      <td>-78.970500</td>\n",
       "      <td>35.0257,-78.9705</td>\n",
       "      <td>5340 Sumac Cir, Fayetteville, NC 28304, USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>39.151817</td>\n",
       "      <td>-77.163810</td>\n",
       "      <td>39.151817,-77.16381</td>\n",
       "      <td>8200 Spiceberry Cir, Gaithersburg, MD 20877, USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>38.636738</td>\n",
       "      <td>-121.319550</td>\n",
       "      <td>38.636738,-121.31955</td>\n",
       "      <td>7836-7840 Fair Oaks Blvd, Carmichael, CA 95608...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>47.616955</td>\n",
       "      <td>-122.348921</td>\n",
       "      <td>47.616955,-122.34892099999999</td>\n",
       "      <td>225 Cedar St, Seattle, WA 98121, USA</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    latitude   longitude                         latlng  \\\n",
       "0  34.537094  -82.630303           34.537094,-82.630303   \n",
       "1  35.025700  -78.970500               35.0257,-78.9705   \n",
       "2  39.151817  -77.163810            39.151817,-77.16381   \n",
       "3  38.636738 -121.319550           38.636738,-121.31955   \n",
       "4  47.616955 -122.348921  47.616955,-122.34892099999999   \n",
       "\n",
       "                                             address  \n",
       "0            111 Simpson Rd, Anderson, SC 29621, USA  \n",
       "1        5340 Sumac Cir, Fayetteville, NC 28304, USA  \n",
       "2   8200 Spiceberry Cir, Gaithersburg, MD 20877, USA  \n",
       "3  7836-7840 Fair Oaks Blvd, Carmichael, CA 95608...  \n",
       "4               225 Cedar St, Seattle, WA 98121, USA  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# tell geopy to reverse geocode some lat-long string using Google's API and return the address\n",
    "def reverse_geopy(latlng):\n",
    "    time.sleep(pause)\n",
    "    geolocator = GoogleV3()\n",
    "    address, _ = geolocator.reverse(latlng, exactly_one=True)\n",
    "    return address\n",
    "\n",
    "usa['address'] = usa['latlng'].map(reverse_geopy)\n",
    "usa"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### What if you just want the city or state?\n",
    "You could try to parse the address strings, but you're relying on them always having a consistent format. This might not be the case if you have international location data. In this case, you should call the API manually and extract the individual address components you are interested in."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# pass the Google API latlng data to reverse geocode it\n",
    "def reverse_geocode(latlng):\n",
    "    time.sleep(pause)\n",
    "    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={}'\n",
    "    request = url.format(latlng)\n",
    "    response = requests.get(request)\n",
    "    data = response.json()\n",
    "    if len(data['results']) > 0:\n",
    "        return data['results'][0] #if we got results, return the first result\n",
    "    \n",
    "geocode_results = usa['latlng'].map(reverse_geocode)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now look inside each reverse geocode result to see if address_components exists. If it does, look inside each component to see if we can find the city or the state. Google calls the city name by the abstract term 'locality' and the state name by the abstract term 'administrative_area_level_1' ...this just lets them use the same terminology anywhere in the world."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def get_city(geocode_result):\n",
    "     if 'address_components' in geocode_result:\n",
    "        for address_component in geocode_result['address_components']:\n",
    "            if 'locality' in address_component['types']:\n",
    "                return address_component['long_name']\n",
    "                \n",
    "def get_state(geocode_result):\n",
    "     if 'address_components' in geocode_result:\n",
    "        for address_component in geocode_result['address_components']:\n",
    "            if 'administrative_area_level_1' in address_component['types']:\n",
    "                return address_component['long_name']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>latlng</th>\n",
       "      <th>address</th>\n",
       "      <th>city</th>\n",
       "      <th>state</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34.537094</td>\n",
       "      <td>-82.630303</td>\n",
       "      <td>34.537094,-82.630303</td>\n",
       "      <td>111 Simpson Rd, Anderson, SC 29621, USA</td>\n",
       "      <td>Anderson</td>\n",
       "      <td>South Carolina</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>35.025700</td>\n",
       "      <td>-78.970500</td>\n",
       "      <td>35.0257,-78.9705</td>\n",
       "      <td>5340 Sumac Cir, Fayetteville, NC 28304, USA</td>\n",
       "      <td>Fayetteville</td>\n",
       "      <td>North Carolina</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>39.151817</td>\n",
       "      <td>-77.163810</td>\n",
       "      <td>39.151817,-77.16381</td>\n",
       "      <td>8200 Spiceberry Cir, Gaithersburg, MD 20877, USA</td>\n",
       "      <td>Gaithersburg</td>\n",
       "      <td>Maryland</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>38.636738</td>\n",
       "      <td>-121.319550</td>\n",
       "      <td>38.636738,-121.31955</td>\n",
       "      <td>7836-7840 Fair Oaks Blvd, Carmichael, CA 95608...</td>\n",
       "      <td>Carmichael</td>\n",
       "      <td>California</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>47.616955</td>\n",
       "      <td>-122.348921</td>\n",
       "      <td>47.616955,-122.34892099999999</td>\n",
       "      <td>225 Cedar St, Seattle, WA 98121, USA</td>\n",
       "      <td>Seattle</td>\n",
       "      <td>Washington</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    latitude   longitude                         latlng  \\\n",
       "0  34.537094  -82.630303           34.537094,-82.630303   \n",
       "1  35.025700  -78.970500               35.0257,-78.9705   \n",
       "2  39.151817  -77.163810            39.151817,-77.16381   \n",
       "3  38.636738 -121.319550           38.636738,-121.31955   \n",
       "4  47.616955 -122.348921  47.616955,-122.34892099999999   \n",
       "\n",
       "                                             address          city  \\\n",
       "0            111 Simpson Rd, Anderson, SC 29621, USA      Anderson   \n",
       "1        5340 Sumac Cir, Fayetteville, NC 28304, USA  Fayetteville   \n",
       "2   8200 Spiceberry Cir, Gaithersburg, MD 20877, USA  Gaithersburg   \n",
       "3  7836-7840 Fair Oaks Blvd, Carmichael, CA 95608...    Carmichael   \n",
       "4               225 Cedar St, Seattle, WA 98121, USA       Seattle   \n",
       "\n",
       "            state  \n",
       "0  South Carolina  \n",
       "1  North Carolina  \n",
       "2        Maryland  \n",
       "3      California  \n",
       "4      Washington  "
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# now map our functions to extract city and state names\n",
    "usa['city'] = geocode_results.map(get_city)                \n",
    "usa['state'] = geocode_results.map(get_state)\n",
    "usa"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 4: Reverse geocoding to FIPS\n",
    "\n",
    "We'll use the FCC's (very slow) Census Block Conversions API to turn lat/long into a block FIPS code. FIPS codes contain from left to right: the location's 2-digit state code, 3-digit county code, 6-digit census tract code, and 4-digit census block code (the first digit of which is the census block group code). Now you can join your data to tract (etc) level census data without doing a spatial join.\n",
    "\n",
    "- Documentation: https://www.fcc.gov/developers/census-block-conversions-api\n",
    "- Example request: http://data.fcc.gov/api/block/find?format=json&latitude=37.861055&longitude=-122.256463"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# pass the FCC API lat/long and get FIPS data back - return block fips and county name\n",
    "def get_fips(row):\n",
    "    time.sleep(pause)\n",
    "    url = 'http://data.fcc.gov/api/block/find?format=json&latitude={}&longitude={}'\n",
    "    request = url.format(row['latitude'], row['longitude'])\n",
    "    response = requests.get(request)\n",
    "    data = response.json()\n",
    "    \n",
    "    # return multiple values as a series - this will create a dataframe with multiple columns\n",
    "    return pd.Series({'fips_code':data['Block']['FIPS'], 'county':data['County']['name']})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>latlng</th>\n",
       "      <th>address</th>\n",
       "      <th>city</th>\n",
       "      <th>state</th>\n",
       "      <th>county</th>\n",
       "      <th>fips_code</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34.537094</td>\n",
       "      <td>-82.630303</td>\n",
       "      <td>34.537094,-82.630303</td>\n",
       "      <td>111 Simpson Rd, Anderson, SC 29621, USA</td>\n",
       "      <td>Anderson</td>\n",
       "      <td>South Carolina</td>\n",
       "      <td>Anderson</td>\n",
       "      <td>450070112021073</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>35.025700</td>\n",
       "      <td>-78.970500</td>\n",
       "      <td>35.0257,-78.9705</td>\n",
       "      <td>5340 Sumac Cir, Fayetteville, NC 28304, USA</td>\n",
       "      <td>Fayetteville</td>\n",
       "      <td>North Carolina</td>\n",
       "      <td>Cumberland</td>\n",
       "      <td>370510019034012</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>39.151817</td>\n",
       "      <td>-77.163810</td>\n",
       "      <td>39.151817,-77.16381</td>\n",
       "      <td>8200 Spiceberry Cir, Gaithersburg, MD 20877, USA</td>\n",
       "      <td>Gaithersburg</td>\n",
       "      <td>Maryland</td>\n",
       "      <td>Montgomery</td>\n",
       "      <td>240317007104010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>38.636738</td>\n",
       "      <td>-121.319550</td>\n",
       "      <td>38.636738,-121.31955</td>\n",
       "      <td>7836-7840 Fair Oaks Blvd, Carmichael, CA 95608...</td>\n",
       "      <td>Carmichael</td>\n",
       "      <td>California</td>\n",
       "      <td>Sacramento</td>\n",
       "      <td>060670078012000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>47.616955</td>\n",
       "      <td>-122.348921</td>\n",
       "      <td>47.616955,-122.34892099999999</td>\n",
       "      <td>225 Cedar St, Seattle, WA 98121, USA</td>\n",
       "      <td>Seattle</td>\n",
       "      <td>Washington</td>\n",
       "      <td>King</td>\n",
       "      <td>530330080011008</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    latitude   longitude                         latlng  \\\n",
       "0  34.537094  -82.630303           34.537094,-82.630303   \n",
       "1  35.025700  -78.970500               35.0257,-78.9705   \n",
       "2  39.151817  -77.163810            39.151817,-77.16381   \n",
       "3  38.636738 -121.319550           38.636738,-121.31955   \n",
       "4  47.616955 -122.348921  47.616955,-122.34892099999999   \n",
       "\n",
       "                                             address          city  \\\n",
       "0            111 Simpson Rd, Anderson, SC 29621, USA      Anderson   \n",
       "1        5340 Sumac Cir, Fayetteville, NC 28304, USA  Fayetteville   \n",
       "2   8200 Spiceberry Cir, Gaithersburg, MD 20877, USA  Gaithersburg   \n",
       "3  7836-7840 Fair Oaks Blvd, Carmichael, CA 95608...    Carmichael   \n",
       "4               225 Cedar St, Seattle, WA 98121, USA       Seattle   \n",
       "\n",
       "            state      county        fips_code  \n",
       "0  South Carolina    Anderson  450070112021073  \n",
       "1  North Carolina  Cumberland  370510019034012  \n",
       "2        Maryland  Montgomery  240317007104010  \n",
       "3      California  Sacramento  060670078012000  \n",
       "4      Washington        King  530330080011008  "
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# get block fips code and county name from FCC as new dataframe, then concatenate to join them\n",
    "fips = usa.apply(get_fips, axis=1)\n",
    "usa = pd.concat([usa, fips], axis=1)\n",
    "usa"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
